Data

In the Data tab, participants can provide the seed and base audiences and see the overlap results.

As soon as both the seed and base audiences have been provided, the platform starts computing the overlap, insight statistics, and audiences. Depending on the dataset sizes, this may take several hours. The same applies whenever the seed or base audience changes, because any change triggers a recomputation.

View overlap

Note: To see the overlap, you need the "View overlap" permission.

Once the base and seed audiences have been provided (see below), the overlap calculation starts automatically, and the overlap chart is shown when it completes. You can select the seed audience (audienceType) for which the chart is displayed. Overlap statistics are shown only if they include at least 100 users. Seed audiences (audienceType) that are too small cannot be selected.
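The 100-user threshold can be illustrated with a small sketch. This is not the platform's implementation, which is not described here. It assumes the overlap for an audienceType is the number of distinct matchingId values that also occur in the base audience, and that the threshold applies to that count. The function name and the `None` marker for hidden results are hypothetical.

```python
MIN_OVERLAP_USERS = 100  # threshold stated in the text above


def overlap_by_audience_type(seed_rows, base_matching_ids, min_users=MIN_OVERLAP_USERS):
    """Count matched users per audienceType; hide counts below the threshold.

    seed_rows: iterable of (matchingId, audienceType) pairs.
    base_matching_ids: iterable of matchingId values from the base audience.
    Returns {audienceType: count or None}, where None means "not shown".
    """
    base = set(base_matching_ids)
    matched = {}
    for matching_id, audience_type in seed_rows:
        # Register every audienceType, even if none of its users match.
        ids = matched.setdefault(audience_type, set())
        if matching_id in base:
            ids.add(matching_id)
    return {
        audience_type: (len(ids) if len(ids) >= min_users else None)
        for audience_type, ids in matched.items()
    }
```

In this sketch, an audienceType whose overlap falls below the threshold maps to `None`, which corresponds to a seed audience that cannot be selected in the chart.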

Providing the seed audiences

Note: To provide the seed audience, you need the "Provide seed audience" permission.

The seed audience is a tabular dataset of users that should be matched with the base audience.

The seed audience should include two columns:

  • matchingId: The identifier to be matched with the base audience. Its type must match the matching ID type of the Media DCR.
  • audienceType: Multiple seed audiences can be uploaded by grouping them in the audienceType column. Typically this column is used to group customers based on their interactions with advertisers or based on segments provided by data partners. A given matchingId can appear in rows with different audienceType values, but each combination of matchingId and audienceType must be unique; duplicates are removed automatically.
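The uniqueness rule for matchingId and audienceType can be sketched as follows. The platform removes duplicates automatically, so this step is optional. The sketch only shows what "unique combination" means, and the function name is hypothetical.

```python
def deduplicate_seed_rows(rows):
    """Keep the first occurrence of each (matchingId, audienceType) pair.

    rows: iterable of (matchingId, audienceType) pairs.
    Returns a list of unique pairs in their original order.
    """
    seen = set()
    unique = []
    for matching_id, audience_type in rows:
        key = (matching_id, audience_type)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

The same matchingId is kept once per audienceType, so a user can belong to both "Sneaker Campaign" and "Athleisure Campaign", but not twice to either.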

When you click Provision dataset, you can either upload a file directly or use an already uploaded dataset. The latter is recommended for automated pipelines or large datasets; see Data management for details. For direct upload, you can also upload a single-column file containing only matchingId; the platform then fills audienceType with a default value.
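If you prefer to set audienceType yourself rather than rely on the default, a single-column file can be expanded locally before upload. The following is a minimal sketch under stated assumptions: the input is CSV text without a header row, and `DEFAULT_AUDIENCE_TYPE` is a hypothetical placeholder, since the value the platform actually uses is not documented here.

```python
import csv
import io

# Hypothetical placeholder; the platform's actual default value is not documented here.
DEFAULT_AUDIENCE_TYPE = "default"


def to_two_column_csv(single_column_csv, audience_type=DEFAULT_AUDIENCE_TYPE):
    """Turn CSV text with one matchingId per line into matchingId,audienceType rows.

    Assumes the input has no header row. Blank lines are skipped.
    """
    reader = csv.reader(io.StringIO(single_column_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row or not row[0].strip():
            continue  # skip blank lines
        writer.writerow([row[0].strip(), audience_type])
    return out.getvalue()
```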

Example seed audience dataset

| matchingId (matching ID type: Hashed email) | audienceType |
| --- | --- |
| e4191e26a5d04a3d8ae9008189a0db7168aba7e8b92944ae90c6… | Sneaker Campaign |
| fb3f59dbe51d42f3bf4f826ac5c22f1f13d2d9e464804539ad59… | Sneaker Campaign |
| fb3f59dbe51d42f3bf4f826ac5c22f1f13d2d9e464804539ad59… | Athleisure Campaign |
| .... | .... |

Providing the base audience

Note: To provide the base audience, you need the "Provide base audience" permission.

Up to four datasets together form a datalab for the base audience. To learn how to create a datalab, please refer to the additional documentation shared with you by your customer success representative.

A datalab consists of the following datasets:

| Dataset | Required | Key columns (in order → meaning) |
| --- | --- | --- |
| Matching | Yes | userId → publisher internal ID, matchingId → join identifier (hashed e-mail, etc.) matching the DCR’s matching ID type |
| Segments | Yes | userId, segment → labels that classify the user for insights and targeting |
| Demographics | Optional | userId, age, gender → demographics for richer analysis |
| Embeddings | Optional | userId, scope → additional grouping of embeddings, emb_x → one or more numeric embedding columns |
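Before creating a datalab, it can help to check the column layout of each file locally. The sketch below is illustrative and not part of the platform. The lowercase dataset keys and the function name are hypothetical, and it assumes that the column order given in the table is enforced and that Matching and Segments are the two mandatory datasets.

```python
# Expected header per dataset, following the table above.
EXPECTED_HEADERS = {
    "matching": ["userId", "matchingId"],
    "segments": ["userId", "segment"],
    "demographics": ["userId", "age", "gender"],
}
# Assumption: the two datasets not marked optional are mandatory.
REQUIRED_DATASETS = ("matching", "segments")


def check_datalab_headers(headers_by_dataset):
    """Return a list of problems found in the given headers (empty if none).

    headers_by_dataset: dict mapping a dataset key to its list of column names.
    """
    problems = []
    for name in REQUIRED_DATASETS:
        if name not in headers_by_dataset:
            problems.append(f"missing required dataset: {name}")
    for name, header in headers_by_dataset.items():
        if name == "embeddings":
            # userId and scope first, then one or more embedding columns.
            if header[:2] != ["userId", "scope"] or len(header) < 3:
                problems.append(
                    "embeddings: expected userId, scope, then at least one embedding column"
                )
        elif name in EXPECTED_HEADERS:
            if header != EXPECTED_HEADERS[name]:
                problems.append(
                    f"{name}: expected columns {EXPECTED_HEADERS[name]}, got {header}"
                )
        else:
            problems.append(f"unknown dataset: {name}")
    return problems
```

The check only covers column names and order. It does not verify that matchingId values match the DCR's matching ID type or that embedding columns are numeric.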

Example datasets

Matching

| userId | matchingId (hashed e-mail) |
| --- | --- |
| ab0dc82c-f120-49d1-82d4-0ab994e8410c | 3402d61a92a47021279b8b0d3625a6e84142f5352d381… |
| 7913b04c-5457-4d18-828d-46e6395428ab | 9cf17fbe88caad4715c1d7f2cc44901d28eb15bfa561… |
| .... | .... |

Segments

| userId | segment |
| --- | --- |
| ab0dc82c-f120-49d1-82d4-0ab994e8410c | News_Reader |
| 7913b04c-5457-4d18-828d-46e6395428ab | Sports_Fan |
| .... | .... |

Demographics (optional)

| userId | age | gender |
| --- | --- | --- |
| ab0dc82c-f120-49d1-82d4-0ab994e8410c | 25-34 | M |
| 7913b04c-5457-4d18-828d-46e6395428ab | 35-44 | F |
| .... | .... | .... |

Embeddings (optional)

| userId | scope | emb_1 | emb_2 | ... |
| --- | --- | --- | --- | --- |
| ab0dc82c-f120-49d1-82d4-0ab994e8410c | nightlynews | 0.12 | 0.83 | ... |
| 7913b04c-5457-4d18-828d-46e6395428ab | nightlynews | 0.45 | 0.66 | ... |
| ... | ... | ... | ... | ... |